SRIUBC-Core: Multiword Soft Similarity Models for Textual Similarity
نویسنده
چکیده
In this year’s Semantic Textual Similarity evaluation, we explore the contribution of models that provide soft similarity scores across spans of multiple words, over the previous year’s system. To this end, we explored the use of neural probabilistic language models and a TF-IDF weighted variant of Explicit Semantic Analysis. The neural language model systems used vector representations of individual words, where these vectors were derived by training them against the context of words encountered, and thus reflect the distributional characteristics of their usage. To generate a similarity score between spans, we experimented with using tiled vectors and Restricted Boltzmann Machines to identify similar encodings. We find that these soft similarity methods generally outperformed our previous year’s systems, albeit they did not perform as well in the overall rankings. A simple analysis of the soft similarity resources over two word phrases is provided, and future areas of improvement are described.
منابع مشابه
SRIUBC: Simple Similarity Features for Semantic Textual Similarity
We describe the systems submitted by SRI International and the University of the Basque Country for the Semantic Textual Similarity (STS) SemEval-2012 task. Our systems focused on using a simple set of features, featuring a mix of semantic similarity resources, lexical match heuristics, and part of speech (POS) information. We also incorporate precision focused scores over lexical and POS infor...
متن کاملMayoClinicNLP-CORE: Semantic representations for textual similarity
The Semantic Textual Similarity (STS) task examines semantic similarity at a sentencelevel. We explored three representations of semantics (implicit or explicit): named entities, semantic vectors, and structured vectorial semantics. From a DKPro baseline, we also performed feature selection and used sourcespecific linear regression models to combine our features. Our systems placed 5th, 6th, an...
متن کاملSOFTCARDINALITY-CORE: Improving Text Overlap with Distributional Measures for Semantic Textual Similarity
The soft cardinality proved to be a very strong text-overlapping baseline for the task of semantic-textual-similarity (STS) obtaining the third place in SemEval-2012. This year, besides to the plain text-overlapping approach, two distributional word-similarity functions derived from the ukWack corpus were tested within the soft cardinality. These measures contributed to improve the performance ...
متن کاملProbabilistic Soft Logic for Semantic Textual Similarity
Probabilistic Soft Logic (PSL) is a recently developed framework for probabilistic logic. We use PSL to combine logical and distributional representations of natural-language meaning, where distributional information is represented in the form of weighted inference rules. We apply this framework to the task of Semantic Textual Similarity (STS) (i.e. judging the semantic similarity of naturallan...
متن کاملNew distance and similarity measures for hesitant fuzzy soft sets
The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...
متن کامل